Robust Value Function Approximation by Working Backwards

Authors

  • Justin A. Boyan
  • Andrew W. Moore
Abstract

In this paper, we examine the intuition that TD(λ) is meant to operate by approximating asynchronous value iteration. We note that on the important class of discrete acyclic stochastic tasks, value iteration is inefficient compared with the DAG-SP algorithm, which essentially performs only one sweep instead of many by working backwards from the goal. The question we address in this paper is whether there is an analogous algorithm that can be used in large stochastic state spaces requiring function approximation. We present such an algorithm, analyze it, and give comparative results to TD on several domains.
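The single backward sweep mentioned in the abstract can be illustrated on a small tabular problem. The following is a minimal sketch, not the algorithm the paper presents: it assumes a discrete acyclic stochastic task with expected-cost minimization, and the names `backward_sweep`, `transitions`, and `terminal_values` are hypothetical, chosen here for illustration. It uses Python's standard `graphlib` module (Python 3.9+) to order states so that every successor is evaluated before its predecessors, which means each state needs exactly one backup.

```python
from graphlib import TopologicalSorter


def backward_sweep(transitions, terminal_values):
    """Compute optimal expected cost-to-go on an acyclic stochastic task.

    transitions: state -> action -> list of (probability, cost, next_state)
    terminal_values: state -> fixed value for goal/terminal states
    (Both structures are illustrative assumptions, not from the paper.)
    """
    # A state's value depends on the values of all its possible successors.
    depends_on = {
        s: {ns for outcomes in actions.values() for _, _, ns in outcomes}
        for s, actions in transitions.items()
    }
    values = dict(terminal_values)
    # static_order() yields a state only after everything it depends on,
    # so the sweep starts at the goal and works backwards.
    # It raises graphlib.CycleError if the task is not acyclic.
    for s in TopologicalSorter(depends_on).static_order():
        if s in values:
            continue  # terminal state, value already known
        # One Bellman backup per state: best action by expected cost.
        values[s] = min(
            sum(p * (c + values[ns]) for p, c, ns in outcomes)
            for outcomes in transitions[s].values()
        )
    return values


# Tiny hypothetical example: A can go to B, or gamble on reaching goal G.
example = {
    "A": {
        "safe": [(1.0, 1.0, "B")],
        "risky": [(0.5, 1.0, "B"), (0.5, 1.0, "G")],
    },
    "B": {"go": [(1.0, 2.0, "G")]},
}
print(backward_sweep(example, {"G": 0.0}))
```

Value iteration on the same task would repeat such backups over all states until the values stop changing. The ordering trick only works when the exact successor values are available, which is why carrying the idea over to function approximation is the open question the abstract poses.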

Similar resources

Function Approximation Approach for Robust Adaptive Control of Flexible joint Robots

This paper is concerned with the problem of designing a robust adaptive controller for flexible joint robots (FJR). Under the assumption of weak joint elasticity, FJR is firstly modeled and converted into singular perturbation form. The control law consists of a FAT-based adaptive control strategy and a simple correction term. The first term of the controller is used to stability of the slow dy...

Learning Evaluation Functions for Large Acyclic Domains

Some of the most successful recent applications of reinforcement learning have used neural networks and the TD(λ) algorithm to learn evaluation functions. In this paper, we examine the intuition that TD(λ) operates by approximating asynchronous value iteration. We note that on the important subclass of acyclic tasks, value iteration is inefficient compared with another graph algorithm, DAG-SP, wh...

Robust adaptive control of voltage saturated flexible joint robots with experimental evaluations

This paper is concerned with the problem of designing and implementing a robust adaptive control strategy for flexible joint electrically driven robots (FJEDR), while considering the constraints on the actuator voltage input. The control design procedure is based on function approximation technique, to avoid saturation besides being robust against both structured and unstructured uncertaintie...

An Alternative Stability Proof for Direct Adaptive Function Approximation Techniques Based Control of Robot Manipulators

This short note points out an improvement on the robust stability analysis for electrically driven robots given in the paper. In the paper, the author presents a FAT-based direct adaptive control scheme for electrically driven robots in the presence of nonlinearities associated with actuator input constraints. However, he does not offer a suitable stability analysis for the closed-loop system. In other w...

Publication date: 1995